{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Tce3stUlHN0L"
      },
      "source": [
        "##### Copyright 2020 The TensorFlow Authors."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "cellView": "form",
        "id": "IcfrhafzkZbH"
      },
      "outputs": [],
      "source": [
        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qFdPvlXBOdUN"
      },
      "source": [
        "# 量化感知训练综合指南"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MfBg1C5NB3X0"
      },
      "source": [
        "<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
        "  <td><a target=\"_blank\" href=\"https://tensorflow.google.cn/model_optimization/guide/quantization/training_comprehensive_guide\"><img src=\"https://tensorflow.google.cn/images/tf_logo_32px.png\">在 TensorFlow.org 上查看 </a></td>\n",
        "  <td><a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/docs-l10n/blob/master/site/zh-cn/model_optimization/guide/quantization/training_comprehensive_guide.ipynb\"><img src=\"https://tensorflow.google.cn/images/colab_logo_32px.png\">在 Google Colab 中运行 </a></td>\n",
        "  <td><a target=\"_blank\" href=\"https://github.com/tensorflow/docs-l10n/blob/master/site/zh-cn/model_optimization/guide/quantization/training_comprehensive_guide.ipynb\"><img src=\"https://tensorflow.google.cn/images/GitHub-Mark-32px.png\">在 GitHub 中查看源代码</a></td>\n",
        "  <td><a href=\"https://storage.googleapis.com/tensorflow_docs/docs-l10n/site/zh-cn/model_optimization/guide/quantization/training_comprehensive_guide.ipynb\"><img src=\"https://tensorflow.google.cn/images/download_logo_32px.png\">下载笔记本</a></td>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "FbORZA_bQx1G"
      },
      "source": [
        "欢迎阅读 Keras 量化感知训练的综合指南。\n",
        "\n",
        "本页面记录了各种用例，并展示了如何将 API 用于每种用例​​。了解需要哪些 API 后，可在 [API 文档](https://tensorflow.google.cn/model_optimization/api_docs/python/tfmot/quantization)中找到参数和底层详细信息：\n",
        "\n",
        "- 如果要查看量化感知训练的好处以及支持的功能，请参阅[概述](https://tensorflow.google.cn/model_optimization/guide/quantization/training.md)。\n",
        "- 有关单个端到端示例，请参阅[量化感知训练示例](https://tensorflow.google.cn/model_optimization/guide/quantization/training_example.md)。\n",
        "\n",
        "涵盖了以下用例：\n",
        "\n",
        "- 按下列步骤操作，部署 8 位量化模型。\n",
        "    - 定义一个量化感知模型。\n",
        "    - 仅对于 Keras HDF5 模型，使用特殊的检查点和反序列化逻辑。否则，将使用标准训练。\n",
        "    - 通过量化感知模型创建量化模型。\n",
        "- 试验量化。\n",
        "    - 实验的任何方面都没有支持的部署路径。\n",
        "    - 自定义 Keras 层处于实验阶段。"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "nuABqZnXVDvO"
      },
      "source": [
        "## 设置"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qqnbd7TOfAq9"
      },
      "source": [
        "如果只是查找您需要的 API 并了解其用途，您可以运行但不阅读本部分。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "cellView": "both",
        "id": "lvpH1Hg7ULFz"
      },
      "outputs": [],
      "source": [
        "! pip uninstall -y tensorflow\n",
        "! pip install -q tf-nightly\n",
        "! pip install -q tensorflow-model-optimization\n",
        "\n",
        "import tensorflow as tf\n",
        "import numpy as np\n",
        "import tensorflow_model_optimization as tfmot\n",
        "\n",
        "import tempfile\n",
        "\n",
        "input_shape = [20]\n",
        "x_train = np.random.randn(1, 20).astype(np.float32)\n",
        "y_train = tf.keras.utils.to_categorical(np.random.randn(1), num_classes=20)\n",
        "\n",
        "def setup_model():\n",
        "  model = tf.keras.Sequential([\n",
        "      tf.keras.layers.Dense(20, input_shape=input_shape),\n",
        "      tf.keras.layers.Flatten()\n",
        "  ])\n",
        "  return model\n",
        "\n",
        "def setup_pretrained_weights():\n",
        "  model= setup_model()\n",
        "\n",
        "  model.compile(\n",
        "      loss=tf.keras.losses.categorical_crossentropy,\n",
        "      optimizer='adam',\n",
        "      metrics=['accuracy']\n",
        "  )\n",
        "\n",
        "  model.fit(x_train, y_train)\n",
        "\n",
        "  _, pretrained_weights = tempfile.mkstemp('.tf')\n",
        "\n",
        "  model.save_weights(pretrained_weights)\n",
        "\n",
        "  return pretrained_weights\n",
        "\n",
        "def setup_pretrained_model():\n",
        "  model = setup_model()\n",
        "  pretrained_weights = setup_pretrained_weights()\n",
        "  model.load_weights(pretrained_weights)\n",
        "  return model\n",
        "\n",
        "setup_model()\n",
        "pretrained_weights = setup_pretrained_weights()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "dTHLMLV-ZrUA"
      },
      "source": [
        "##定义量化感知模型"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "0U6XAUhIe6re"
      },
      "source": [
        "通过按以下方式定义模型，可以获得[概述](https://tensorflow.google.cn/model_optimization/guide/quantization/training.md)页面中所列后端的部署路径。默认情况下，使用 8 位量化。\n",
        "\n",
        "注：量化感知模型实际上并未量化。创建量化模型是一个单独的步骤。"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Ybigft1fTn4T"
      },
      "source": [
        "### 量化整个模型"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "puZvqnp1xsn-"
      },
      "source": [
        "**您的用例：**\n",
        "\n",
        "- 不支持子类化模型。\n",
        "\n",
        "**提高模型准确率的提示：**\n",
        "\n",
        "- 尝试“量化某些层”以跳过量化对准确率影响最大的层。\n",
        "- 与从头开始训练相比，使用量化感知训练进行微调的效果一般更好。\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_Zhzx_azO1WR"
      },
      "source": [
        "要使整个模型可以感知量化，请将 `tfmot.quantization.keras.quantize_model` 应用于模型。\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "1s_EK8reOruu"
      },
      "outputs": [],
      "source": [
        "base_model = setup_model()\n",
        "base_model.load_weights(pretrained_weights) # optional but recommended for model accuracy\n",
        "\n",
        "quant_aware_model = tfmot.quantization.keras.quantize_model(base_model)\n",
        "quant_aware_model.summary()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "xTbTLn3dZM7h"
      },
      "source": [
        "### 量化某些层"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MbM8o832xTxV"
      },
      "source": [
        "量化模型可能会对准确率造成负面影响。您可以选择性地量化模型的各个层来探索准确率、速度和模型大小之间的最佳平衡。\n",
        "\n",
        "**您的用例：**\n",
        "\n",
        "- 要部署到仅适用于完全量化模型（例如 EdgeTPU v1、大多数 DSP）的后端，请尝试“量化整个模型”。\n",
        "\n",
        "**提高模型准确率的提示：**\n",
        "\n",
        "- 与从头开始训练相比，使用量化感知训练进行微调的效果一般更好。\n",
        "- 尝试量化后面的层而不是前面的层。\n",
        "- 避免量化关键层（例如注意力机制）。\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3OCbOUWHsE_v"
      },
      "source": [
        "在下面的示例中，仅量化 `Dense` 层。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "HN0B_QB-ZhE2"
      },
      "outputs": [],
      "source": [
        "# Create a base model\n",
        "base_model = setup_model()\n",
        "base_model.load_weights(pretrained_weights) # optional but recommended for model accuracy\n",
        "\n",
        "# Helper function uses `quantize_annotate_layer` to annotate that only the \n",
        "# Dense layers should be quantized.\n",
        "def apply_quantization_to_dense(layer):\n",
        "  if isinstance(layer, tf.keras.layers.Dense):\n",
        "    return tfmot.quantization.keras.quantize_annotate_layer(layer)\n",
        "  return layer\n",
        "\n",
        "# Use `tf.keras.models.clone_model` to apply `apply_quantization_to_dense` \n",
        "# to the layers of the model.\n",
        "annotated_model = tf.keras.models.clone_model(\n",
        "    base_model,\n",
        "    clone_function=apply_quantization_to_dense,\n",
        ")\n",
        "\n",
        "# Now that the Dense layers are annotated,\n",
        "# `quantize_apply` actually makes the model quantization aware.\n",
        "quant_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)\n",
        "quant_aware_model.summary()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "HiA28PrrW11H"
      },
      "source": [
        "尽管此示例使用层的类型来决定要量化的内容，但是量化特定层的最简单方式是设置其 `name` 属性，然后在 `clone_function` 中查找该名称。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "CjY_JyB808Da"
      },
      "outputs": [],
      "source": [
        "print(base_model.layers[0].name)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mpb_BydRaSoF"
      },
      "source": [
        "#### 更具可读性，但模型准确率可能较低"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2vqXeYffzSHp"
      },
      "source": [
        "这与通过量化感知训练进行的微调不兼容，这就是它的准确率可能低于上述示例的原因。"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MQoMH3g3fWwb"
      },
      "source": [
        "**函数式模型示例**"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "7Wow55hg5oiM"
      },
      "outputs": [],
      "source": [
        "# Use `quantize_annotate_layer` to annotate that the `Dense` layer\n",
        "# should be quantized.\n",
        "i = tf.keras.Input(shape=(20,))\n",
        "x = tfmot.quantization.keras.quantize_annotate_layer(tf.keras.layers.Dense(10))(i)\n",
        "o = tf.keras.layers.Flatten()(x)\n",
        "annotated_model = tf.keras.Model(inputs=i, outputs=o)\n",
        "\n",
        "# Use `quantize_apply` to actually make the model quantization aware.\n",
        "quant_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)\n",
        "\n",
        "# For deployment purposes, the tool adds `QuantizeLayer` after `InputLayer` so that the\n",
        "# quantized model can take in float inputs instead of only uint8.\n",
        "quant_aware_model.summary()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wIGj-r2of2ls"
      },
      "source": [
        "**序贯模型示例**\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "mQOiDUGgfi4y"
      },
      "outputs": [],
      "source": [
        "# Use `quantize_annotate_layer` to annotate that the `Dense` layer\n",
        "# should be quantized.\n",
        "annotated_model = tf.keras.Sequential([\n",
        "  tfmot.quantization.keras.quantize_annotate_layer(tf.keras.layers.Dense(20, input_shape=input_shape)),\n",
        "  tf.keras.layers.Flatten()\n",
        "])\n",
        "\n",
        "# Use `quantize_apply` to actually make the model quantization aware.\n",
        "quant_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)\n",
        "\n",
        "quant_aware_model.summary()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MpvX5IqahV1r"
      },
      "source": [
        "## 设置检查点和反序列化"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "GuZ5wlij1dcJ"
      },
      "source": [
        "**您的用例**：仅 HDF5 模型格式需要此代码（HDF5 权重或其他格式不需要）。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "6khQg-q7imfH"
      },
      "outputs": [],
      "source": [
        "# Define the model.\n",
        "base_model = setup_model()\n",
        "base_model.load_weights(pretrained_weights) # optional but recommended for model accuracy\n",
        "quant_aware_model = tfmot.quantization.keras.quantize_model(base_model)\n",
        "\n",
        "# Save or checkpoint the model.\n",
        "_, keras_model_file = tempfile.mkstemp('.h5')\n",
        "quant_aware_model.save(keras_model_file)\n",
        "\n",
        "# `quantize_scope` is needed for deserializing HDF5 models.\n",
        "with tfmot.quantization.keras.quantize_scope():\n",
        "  loaded_model = tf.keras.models.load_model(keras_model_file)\n",
        "\n",
        "loaded_model.summary()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "NeNCMDAbnEKU"
      },
      "source": [
        "## 创建并部署量化模型"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "iiYk_KR0rJ2n"
      },
      "source": [
        "通常，请参考将要使用的部署后端的文档。\n",
        "\n",
        "下面是一个 TFLite 后端的示例。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "fbBiEetda3R8"
      },
      "outputs": [],
      "source": [
        "base_model = setup_pretrained_model()\n",
        "quant_aware_model = tfmot.quantization.keras.quantize_model(base_model)\n",
        "\n",
        "# Typically you train the model here.\n",
        "\n",
        "converter = tf.lite.TFLiteConverter.from_keras_model(quant_aware_model)\n",
        "converter.optimizations = [tf.lite.Optimize.DEFAULT]\n",
        "\n",
        "quantized_tflite_model = converter.convert()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "v5raSy9ghxkv"
      },
      "source": [
        "## 试验量化"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "LUGpXIET0cy3"
      },
      "source": [
        "**您的用例**：使用以下 API 意味着没有支持的部署路径。这些功能也是实验性功能，不具备向后兼容性。\n",
        "\n",
        "- `tfmot.quantization.keras.QuantizeConfig`\n",
        "- `tfmot.quantization.keras.quantizers.Quantizer`\n",
        "- `tfmot.quantization.keras.quantizers.LastValueQuantizer`\n",
        "- `tfmot.quantization.keras.quantizers.MovingAverageQuantizer`"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Q1KI_FCcU7Yn"
      },
      "source": [
        "### 设置：DefaultDenseQuantizeConfig"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "I6nPkJDRUB2G"
      },
      "source": [
        "要进行实验，需要使用 `tfmot.quantization.keras.QuantizeConfig`，它描述了如何量化层的权重、激活和输出。\n",
        "\n",
        "以下示例定义了 API 默认值中用于 `Dense` 层的相同 `QuantizeConfig`。\n",
        "\n",
        "在此示例的正向传播过程中，以 `layer.kernel` 作为输入调用了 `get_weights_and_quantizers` 中返回的 `LastValueQuantizer`，从而产生了输出。通过 `set_quantize_weights` 中定义的逻辑，输出将替换 `Dense` 层的原始正向传播中的 `layer.kernel`。同样的构想也适用于激活和输出。\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "B9SWK5UQT7VQ"
      },
      "outputs": [],
      "source": [
        "LastValueQuantizer = tfmot.quantization.keras.quantizers.LastValueQuantizer\n",
        "MovingAverageQuantizer = tfmot.quantization.keras.quantizers.MovingAverageQuantizer\n",
        "\n",
        "class DefaultDenseQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):\n",
        "    # Configure how to quantize weights.\n",
        "    def get_weights_and_quantizers(self, layer):\n",
        "      return [(layer.kernel, LastValueQuantizer(num_bits=8, symmetric=True, narrow_range=False, per_axis=False))]\n",
        "\n",
        "    # Configure how to quantize activations.\n",
        "    def get_activations_and_quantizers(self, layer):\n",
        "      return [(layer.activation, MovingAverageQuantizer(num_bits=8, symmetric=False, narrow_range=False, per_axis=False))]\n",
        "\n",
        "    def set_quantize_weights(self, layer, quantize_weights):\n",
        "      # Add this line for each item returned in `get_weights_and_quantizers`\n",
        "      # , in the same order\n",
        "      layer.kernel = quantize_weights[0]\n",
        "\n",
        "    def set_quantize_activations(self, layer, quantize_activations):\n",
        "      # Add this line for each item returned in `get_activations_and_quantizers`\n",
        "      # , in the same order.\n",
        "      layer.activation = quantize_activations[0]\n",
        "\n",
        "    # Configure how to quantize outputs (may be equivalent to activations).\n",
        "    def get_output_quantizers(self, layer):\n",
        "      return []\n",
        "\n",
        "    def get_config(self):\n",
        "      return {}"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8vJeoGQG9ZX0"
      },
      "source": [
        "### 量化自定义 Keras 层\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "YmyhI_bzWb2w"
      },
      "source": [
        "本示例使用 `DefaultDenseQuantizeConfig` 来量化 `CustomLayer`。\n",
        "\n",
        "在“试验量化”用例中，应用的配置是相同的。\n",
        "\n",
        "- 将 `tfmot.quantization.keras.quantize_annotate_layer` 应用于 `CustomLayer` 并在 `QuantizeConfig` 中传递。\n",
        "- 通过 `tfmot.quantization.keras.quantize_annotate_model` 继续使用 API ​​默认值来量化模型的其余部分。\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "7_rBOJdyWWEs"
      },
      "outputs": [],
      "source": [
        "quantize_annotate_layer = tfmot.quantization.keras.quantize_annotate_layer\n",
        "quantize_annotate_model = tfmot.quantization.keras.quantize_annotate_model\n",
        "quantize_scope = tfmot.quantization.keras.quantize_scope\n",
        "\n",
        "class CustomLayer(tf.keras.layers.Dense):\n",
        "  pass\n",
        "\n",
        "model = quantize_annotate_model(tf.keras.Sequential([\n",
        "   quantize_annotate_layer(CustomLayer(20, input_shape=(20,)), DefaultDenseQuantizeConfig()),\n",
        "   tf.keras.layers.Flatten()\n",
        "]))\n",
        "\n",
        "# `quantize_apply` requires mentioning `DefaultDenseQuantizeConfig` with `quantize_scope`\n",
        "# as well as the custom Keras layer.\n",
        "with quantize_scope(\n",
        "  {'DefaultDenseQuantizeConfig': DefaultDenseQuantizeConfig,\n",
        "   'CustomLayer': CustomLayer}):\n",
        "  # Use `quantize_apply` to actually make the model quantization aware.\n",
        "  quant_aware_model = tfmot.quantization.keras.quantize_apply(model)\n",
        "\n",
        "quant_aware_model.summary()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vnMguvVSnUqD"
      },
      "source": [
        "### 修改量化参数\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BLgH1aFMjTK4"
      },
      "source": [
        "**常见误区**：将偏差量化为少于 32 位通常会严重影响模型准确率。\n",
        "\n",
        "本示例将 `Dense` 层修改为将 4 位用于其权重，而不是默认的 8 位。模型的其余部分继续使用 API 默认值。\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "77jgBjccnTh6"
      },
      "outputs": [],
      "source": [
        "quantize_annotate_layer = tfmot.quantization.keras.quantize_annotate_layer\n",
        "quantize_annotate_model = tfmot.quantization.keras.quantize_annotate_model\n",
        "quantize_scope = tfmot.quantization.keras.quantize_scope\n",
        "\n",
        "class ModifiedDenseQuantizeConfig(DefaultDenseQuantizeConfig):\n",
        "    # Configure weights to quantize with 4-bit instead of 8-bits.\n",
        "    def get_weights_and_quantizers(self, layer):\n",
        "      return [(layer.kernel, LastValueQuantizer(num_bits=4, symmetric=True, narrow_range=False, per_axis=False))]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "x9JDKhaU3FKe"
      },
      "source": [
        "在“试验量化”用例中，应用的配置是相同的。\n",
        "\n",
        "- 将 `tfmot.quantization.keras.quantize_annotate_layer` 应用于 `Dense` 层并在 `QuantizeConfig` 中传递。\n",
        "- 通过 `tfmot.quantization.keras.quantize_annotate_model` 继续使用 API ​​默认值来量化模型的其余部分。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "sq5mfyBF3KxV"
      },
      "outputs": [],
      "source": [
        "model = quantize_annotate_model(tf.keras.Sequential([\n",
        "   # Pass in modified `QuantizeConfig` to modify this Dense layer.\n",
        "   quantize_annotate_layer(tf.keras.layers.Dense(20, input_shape=(20,)), ModifiedDenseQuantizeConfig()),\n",
        "   tf.keras.layers.Flatten()\n",
        "]))\n",
        "\n",
        "# `quantize_apply` requires mentioning `ModifiedDenseQuantizeConfig` with `quantize_scope`:\n",
        "with quantize_scope(\n",
        "  {'ModifiedDenseQuantizeConfig': ModifiedDenseQuantizeConfig}):\n",
        "  # Use `quantize_apply` to actually make the model quantization aware.\n",
        "  quant_aware_model = tfmot.quantization.keras.quantize_apply(model)\n",
        "\n",
        "quant_aware_model.summary()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bJMKgzh84CCs"
      },
      "source": [
        "### 修改要量化的层的部分\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Z3pij2uO808g"
      },
      "source": [
        "本示例将 `Dense` 层修改为跳过量化激活。模型的其余部分继续使用 API 默认值。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "6BaaJPBR8djV"
      },
      "outputs": [],
      "source": [
        "quantize_annotate_layer = tfmot.quantization.keras.quantize_annotate_layer\n",
        "quantize_annotate_model = tfmot.quantization.keras.quantize_annotate_model\n",
        "quantize_scope = tfmot.quantization.keras.quantize_scope\n",
        "\n",
        "class ModifiedDenseQuantizeConfig(DefaultDenseQuantizeConfig):\n",
        "    def get_activations_and_quantizers(self, layer):\n",
        "      # Skip quantizing activations.\n",
        "      return []\n",
        "\n",
        "    def set_quantize_activations(self, layer, quantize_activations):\n",
        "      # Empty since `get_activaations_and_quantizers` returns\n",
        "      # an empty list.\n",
        "      return"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2OkqHX5r2nT7"
      },
      "source": [
        "在“试验量化”用例中，应用的配置是相同的。\n",
        "\n",
        "- 将 `tfmot.quantization.keras.quantize_annotate_layer` 应用于 `Dense` 层并在 `QuantizeConfig` 中传递。\n",
        "- 通过 `tfmot.quantization.keras.quantize_annotate_model` 继续使用 API ​​默认值来量化模型的其余部分。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Ln9MDIZJ2n3F"
      },
      "outputs": [],
      "source": [
        "model = quantize_annotate_model(tf.keras.Sequential([\n",
        "   # Pass in modified `QuantizeConfig` to modify this Dense layer.\n",
        "   quantize_annotate_layer(tf.keras.layers.Dense(20, input_shape=(20,)), ModifiedDenseQuantizeConfig()),\n",
        "   tf.keras.layers.Flatten()\n",
        "]))\n",
        "\n",
        "# `quantize_apply` requires mentioning `ModifiedDenseQuantizeConfig` with `quantize_scope`:\n",
        "with quantize_scope(\n",
        "  {'ModifiedDenseQuantizeConfig': ModifiedDenseQuantizeConfig}):\n",
        "  # Use `quantize_apply` to actually make the model quantization aware.\n",
        "  quant_aware_model = tfmot.quantization.keras.quantize_apply(model)\n",
        "\n",
        "quant_aware_model.summary()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "yD0sIR6tmmRx"
      },
      "source": [
        "### 使用自定义量化算法\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "I4onhF-H1zsn"
      },
      "source": [
        "`tfmot.quantization.keras.quantizers.Quantizer` 类是一个可调用对象，可以将任何算法应用于其输入。\n",
        "\n",
        "在本示例中，输入是权重，我们将 `FixedRangeQuantizer` __call__ 函数中的数学运算应用于权重。现在，`FixedRangeQuantizer` 的输出将代替原始权重值传递给使用这些权重的任何对象。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Jt8UioZH49QV"
      },
      "outputs": [],
      "source": [
        "quantize_annotate_layer = tfmot.quantization.keras.quantize_annotate_layer\n",
        "quantize_annotate_model = tfmot.quantization.keras.quantize_annotate_model\n",
        "quantize_scope = tfmot.quantization.keras.quantize_scope\n",
        "\n",
        "class FixedRangeQuantizer(tfmot.quantization.keras.quantizers.Quantizer):\n",
        "  \"\"\"Quantizer which forces outputs to be between -1 and 1.\"\"\"\n",
        "\n",
        "  def build(self, tensor_shape, name, layer):\n",
        "    # Not needed. No new TensorFlow variables needed.\n",
        "    return {}\n",
        "\n",
        "  def __call__(self, inputs, training, weights, **kwargs):\n",
        "    return tf.keras.backend.clip(inputs, -1.0, 1.0)\n",
        "\n",
        "  def get_config(self):\n",
        "    # Not needed. No __init__ parameters to serialize.\n",
        "    return {}\n",
        "\n",
        "\n",
        "class ModifiedDenseQuantizeConfig(DefaultDenseQuantizeConfig):\n",
        "    # Configure weights to quantize with 4-bit instead of 8-bits.\n",
        "    def get_weights_and_quantizers(self, layer):\n",
        "      # Use custom algorithm defined in `FixedRangeQuantizer` instead of default Quantizer.\n",
        "      return [(layer.kernel, FixedRangeQuantizer())]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "lu5ZeJ_Y2UxW"
      },
      "source": [
        "在“试验量化”用例中，应用的配置是相同的。\n",
        "\n",
        "- 将 `tfmot.quantization.keras.quantize_annotate_layer` 应用于 `Dense` 层并在 `QuantizeConfig` 中传递。\n",
        "- 通过 `tfmot.quantization.keras.quantize_annotate_model` 继续使用 API ​​默认值来量化模型的其余部分。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "ItC_3mwT2U87"
      },
      "outputs": [],
      "source": [
        "model = quantize_annotate_model(tf.keras.Sequential([\n",
        "   # Pass in modified `QuantizeConfig` to modify this `Dense` layer.\n",
        "   quantize_annotate_layer(tf.keras.layers.Dense(20, input_shape=(20,)), ModifiedDenseQuantizeConfig()),\n",
        "   tf.keras.layers.Flatten()\n",
        "]))\n",
        "\n",
        "# `quantize_apply` requires mentioning `ModifiedDenseQuantizeConfig` with `quantize_scope`:\n",
        "with quantize_scope(\n",
        "  {'ModifiedDenseQuantizeConfig': ModifiedDenseQuantizeConfig}):\n",
        "  # Use `quantize_apply` to actually make the model quantization aware.\n",
        "  quant_aware_model = tfmot.quantization.keras.quantize_apply(model)\n",
        "\n",
        "quant_aware_model.summary()"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "collapsed_sections": [
        "Tce3stUlHN0L"
      ],
      "name": "training_comprehensive_guide.ipynb",
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
